As discussed during lectures, some APIs require authorization to provide access and some do not. The YouTube API provides some data without authorization, and this data can easily be accessed using a third-party package called pafy. Pafy provides easy access to video data from YouTube once the user supplies the URL. The code below is not explained line by line, as it is quite user-friendly and self-explanatory. The official documentation for the pafy package can be found in its GitHub repository.
The package can be installed using pip as usual:
pip install pafy
The data can also be accessed without this package, because the API response is a publicly available JSON file. One can find that file as follows: https://www.youtube.com/oembed?url=http://www.youtube.com/watch?v=BGBM5vWiBLo&format=json Just replace the ID with your video ID (the part after watch?v= and before &format=json). Similarly, one can get the XML version of the response by changing the last part (json) to xml.
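The direct request described above can be sketched with the standard library alone. This is a minimal illustration, not part of pafy: the helper names `build_oembed_url` and `fetch_oembed` are made up for this example, and the fields of the returned JSON are not listed here because they are whatever the endpoint sends back.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL in the text above.
OEMBED_ENDPOINT = "https://www.youtube.com/oembed"


def build_oembed_url(video_id, fmt="json"):
    """Build the oEmbed request URL for a video ID (hypothetical helper)."""
    watch_url = "https://www.youtube.com/watch?v=" + video_id
    # urlencode percent-escapes the nested watch URL so that its own
    # "?" and "=" characters do not clash with the outer query string.
    return OEMBED_ENDPOINT + "?" + urlencode({"url": watch_url, "format": fmt})


def fetch_oembed(video_id):
    """Download and parse the JSON response (needs network access)."""
    with urlopen(build_oembed_url(video_id), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


print(build_oembed_url("BGBM5vWiBLo"))
```

Calling `fetch_oembed("BGBM5vWiBLo")` should return the same data as opening the link in a browser, parsed into a Python dictionary. Passing `fmt="xml"` to `build_oembed_url` gives the URL of the XML variant, which would need an XML parser instead of `json.loads`.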
In [1]:
import pafy
url = "https://www.youtube.com/watch?v=BGBM5vWiBLo"
video = pafy.new(url)
In [2]:
video.title
Out[2]:
In [3]:
video.description
Out[3]:
In [4]:
details = [video.title, video.rating, video.viewcount, video.author, video.length]
print(details)
In [5]:
# downloading the video with best quality
best_video = video.getbest()
best_video.download(quiet=False)
Out[5]:
In [6]:
# downloading the audio of the video with best quality
bestaudio = video.getbestaudio()
bestaudio.download()
Out[6]:
In [7]:
# getting all streams: all possible audio/video extensions
allstreams = video.allstreams
from pprint import pprint
pprint(allstreams)
In [8]:
for i in allstreams:
    print(i.mediatype, i.extension, i.quality)
In [9]:
# download a chosen filetype, e.g. m4a
allstreams[-3].download()
Out[9]:
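Picking a stream by position, as in `allstreams[-3]`, depends on the order in which the streams happen to be listed for a particular video. A slightly more robust sketch selects by the `extension` and `mediatype` attributes printed above. The helper `pick_stream` is hypothetical (it is not part of pafy), and the `demo_streams` list holds made-up stand-in objects so that the example runs without a network connection.

```python
from types import SimpleNamespace


def pick_stream(streams, extension, mediatype=None):
    """Return the first stream with the given extension, or None.

    Hypothetical helper: it only relies on the .extension and
    .mediatype attributes used in the loop above.
    """
    for stream in streams:
        if stream.extension != extension:
            continue
        if mediatype is not None and stream.mediatype != mediatype:
            continue
        return stream
    return None


# Stand-in objects with invented values, used instead of video.allstreams.
demo_streams = [
    SimpleNamespace(mediatype="normal", extension="mp4", quality="640x360"),
    SimpleNamespace(mediatype="audio", extension="m4a", quality="128k"),
    SimpleNamespace(mediatype="audio", extension="webm", quality="160k"),
]

chosen = pick_stream(demo_streams, "m4a", mediatype="audio")
print(chosen.mediatype, chosen.extension, chosen.quality)
```

With a real video, the same call would be `pick_stream(allstreams, "m4a")`, followed by `.download()` on the result after checking that it is not `None`.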